Scalable data management for map-reduce-based data-intensive applications: a view for cloud and hybrid infrastructures

Authors

  • Gabriel Antoniu
  • Alexandru Costan
  • Julien Bigot
  • Frédéric Desprez
  • Gilles Fedak
  • Sylvain Gault
  • Christian Pérez
  • Anthony Simonet
  • Bing Tang
  • Christophe Blanchet
  • Raphael Terreux
  • Luc Bougé
  • François Briant
  • Franck Cappello
  • Katarzyna Keahey
  • Bogdan Nicolae
  • Frédéric Suter
Abstract

As Map-Reduce emerges as a leading programming paradigm for data-intensive computing, the frameworks that support it today still have substantial shortcomings that limit its potential scalability. In this paper we discuss several directions where there is room for progress: they concern storage efficiency under...

Affiliations:
  • a: Corresponding author
  • b: INRIA Research Center, Rennes – Bretagne Atlantique, Rennes, France
  • c: INRIA Research Center, Grenoble Rhône – Alpes, Lyon, France
  • d: CNRS/Université Lyon 1, Institut de Biologie et Chimie des Protéines, Lyon, France
  • e: ENS Cachan – Antenne de Bretagne, Rennes, France
  • f: IBM Products and Solutions Support Center, Montpellier, France
  • g: Joint INRIA-UIUC Laboratory for Petascale Computing, Urbana-Champaign, USA
  • h: Argonne National Laboratory, Argonne, USA
  • i: CNRS, CC IN2P3, Lyon, France
  • j: IBM Research, Dublin, Ireland

Copyright © 2012 Inderscience Enterprises Ltd.

Similar resources

Towards Scalable Data Management for Map-Reduce-based Data-Intensive Applications on Cloud and Hybrid Infrastructures

Data:
  • Massive, unstructured data objects (Terabytes)
  • Many data objects (10³-10⁶)
  • High concurrency (10³ concurrent clients)
  • Fine-grain access (Megabytes)

Applications:
  • Map-Reduce-based data-analysis applications
  • Governmental and commercial statistics
  • Data-intensive HPC simulations
  • Checkpointing for massively parallel computations

Platforms:

Data Replication-Based Scheduling in Cloud Computing Environment

Abstract— High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems like data grid, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, accessing data files is critical for performing such applications. Sometimes accessing data becomes...

BlobSeer: Next-generation data management for large scale infrastructures

As data volumes increase at a high speed in more and more application fields of science, engineering, information services, etc., the challenges posed by data-intensive computing gain an increasing importance. The emergence of highly scalable infrastructures, e.g. for cloud computing and for petascale computing and beyond introduces additional issues for which scalable data management becomes a...

An Efficient Secret Sharing-based Storage System for Cloud-based Internet of Things

The Internet of Things (IoT) is a new information architecture based on the internet that develops interactions between objects and services in a secure and reliable environment. As the availability of smart devices rises, secure and scalable mass storage systems for aggregate data are required in IoT applications. In this paper, we propose a new method for storing aggregate data in Io...

Energy Aware Resource Management of Cloud Data Centers

Cloud Computing, the long-held dream of computing as a utility, has the potential to transform a large part of the IT industry, making software even more attractive as a service and shaping the way IT hardware is designed and purchased. Virtualization technology forms a key concept for new cloud computing architectures. The data centers are used to provide cloud services burdening a significant...

Journal title:
  • IJCC

Volume 2  Issue 

Pages  -

Publication date 2013